When law enforcement officers talk about the "most effective" caliber bullet or the "best" combat handgun on the street, emotions run high and opinions vary. This can be expected, since these topics have caused considerable debate for years.
But what of the firearms expert who is tasked with the responsibility of selecting ammunition and firearms for a department? What are the crucial issues that should be considered? Where should testing begin? What needs to be addressed in order to conduct a fair and impartial ammunition and firearms selection program?
The FBI Academy's Institutional Research and Development Unit (IRDU) provides consultation primarily to the FBI's Training Division personnel regarding research methodology, evaluation and statistical analysis. This article provides an introduction to research design and statistical analysis with regard to ammunition selection. It is intended to assist firearms personnel in designing an ammunition research project and analyzing the results.
The topics addressed include (1) research design, (2) criteria for selecting ammunition, (3) rater bias, and (4) statistical analyses. Throughout the article emphasis is placed on understanding the logic of the various elements of a research project.
DESIGN OF THE RESEARCH
Kerlinger, a research methodologist, indicates that research design is the structure, plan or strategy developed to obtain results from a research project. "Research designs are invented to enable the researcher to answer research questions as validly, objectively, accurately, and economically as possible."(1)
In designing any ammunition selection study, the first step is to determine the comparisons to be made. For example, is the purpose of the study to compare the same caliber bullet performance for ammunition made by different companies or to compare the performance of the same caliber bullet in handguns produced by different manufacturers?
The following research design is used throughout this article as a convenient example: three different calibers are compared on performance measures of penetration, expansion and weight in a variety of target simulants (targets). Examples of targets are gelatin blocks to simulate human tissue, sheets of metal to resemble the properties of an automobile door, automobile windshield glass held at a given angle, and so on.
"Internal validity" and "external validity" are two major criteria by which any research design is judged. Internal validity, for the example shown above, is the extent to which differences in penetration, expansion and weight can be attributed to differences in the physical characteristics of the calibers rather than to other influences or conditions. External validity is the extent to which similar differences in performance would generalize to other ammunition, conditions or settings. The ideal would be to maximize both internal and external validity. However, the importance of maximizing internal validity, that is, controlling unwanted influences, by necessity, often limits external validity.
Internal Validity
Internal validity is extremely important in any ammunition selection study; if the research is internally valid, then there is a high probability that the differences in caliber performance are caused by the different sizes of the calibers. Internal validity is synonymous with control over unwanted influences. For ammunition selection studies, the unwanted influences that must be controlled or held constant would include environmental conditions, physical/human conditions, and target simulants.
Environmental conditions-In an indoor range, environmental conditions for firing ammunition can be easily controlled. Shooting should take place where temperature, weather, light and noise are kept fairly constant. Without an indoor range, keeping these conditions constant is extremely difficult.
Physical/human conditions-Many other physical and human influences can affect a study. Some of these influences can be determined; others cannot. The best way to control unwanted influences is to simultaneously set up test barrels, one for each caliber to be tested, and randomly determine the order in which the test barrels are fired. (A table of random numbers can be used to determine the order.) For example, a researcher who fires one caliber all morning and then fires a different caliber throughout the afternoon might have measurements influenced by the fatigue of late afternoon shooting and therefore unintentionally record measurement results favoring the caliber shot in the morning.
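The random firing order described above could be generated along the following lines. This is only an illustrative sketch: the function name firing_schedule, the caliber labels, the number of rounds and the seed are all assumptions, and Python's pseudorandom generator stands in for the printed table of random numbers.

```python
import random


def firing_schedule(calibers, rounds_per_caliber, seed=None):
    """Return a randomly ordered list of shots, one entry per round to be fired."""
    rng = random.Random(seed)  # a fixed seed makes the schedule reproducible
    # One entry per round, so every caliber appears rounds_per_caliber times.
    shots = [caliber for caliber in calibers for _ in range(rounds_per_caliber)]
    rng.shuffle(shots)  # random order spreads fatigue and time-of-day effects
    return shots


# Hypothetical labels; substitute the calibers actually under test.
schedule = firing_schedule(["caliber A", "caliber B", "caliber C"],
                           rounds_per_caliber=5, seed=1)
for shot_number, caliber in enumerate(schedule, start=1):
    print(shot_number, caliber)
```

Recording the seed with the study notes allows the same schedule to be regenerated later.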
Other variables are not controlled by random ordering for firing the different calibers. For example, if test barrels are not of equal length, firing them in random order would not compensate for these differences. Using test barrels of unequal length will affect not only the velocity but also the extent of penetration. Therefore, if unequal length test barrels are used, additional research is necessary to determine the extent of the differences among the calibers tested, which adds greatly to the complexity of the research.
Targets-Whether one type of target or a variety of targets is used in the study, controlling the variations in the construction of these targets is critical and can be done by randomly distributing targets (again using the random numbers table) of a given type across calibers. For example, if a batch of gelatin blocks is not mixed thoroughly and blocks with greater density are used with only one caliber, then any differences in penetration, expansion or weight for the different calibers could be partially or fully caused by the consistency of the gelatin blocks.
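Random distribution of targets across calibers can be sketched in the same way. The function name assign_blocks, the block numbering and the caliber labels below are hypothetical, and the sketch assumes the batch divides evenly among the calibers.

```python
import random


def assign_blocks(block_ids, calibers, seed=None):
    """Randomly split one batch of gelatin blocks evenly among the calibers."""
    if len(block_ids) % len(calibers) != 0:
        raise ValueError("the batch must divide evenly among the calibers")
    rng = random.Random(seed)
    shuffled = list(block_ids)
    rng.shuffle(shuffled)  # breaks any link between mixing order and caliber
    per_caliber = len(shuffled) // len(calibers)
    return {
        caliber: sorted(shuffled[i * per_caliber:(i + 1) * per_caliber])
        for i, caliber in enumerate(calibers)
    }


# Hypothetical batch of 15 blocks numbered in the order they were poured.
assignment = assign_blocks(range(1, 16),
                           ["caliber A", "caliber B", "caliber C"], seed=7)
for caliber, blocks in assignment.items():
    print(caliber, blocks)
```

Because every block has the same chance of going to any caliber, density differences within the batch should not systematically favor one caliber.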
Since gelatin blocks are used both as stand-alones and behind other targets, two other controls are suggested. First, because gelatin blocks can deteriorate easily, care must be taken to preserve their integrity. Gelatin blocks should be stored in insulated coolers prior to use and should be checked by measuring their temperature before being used for targets. Second, an already-penetrated gelatin block should not be used again as a target. The trauma from the first round's impact may disturb the consistency of the gelatin and affect the measurement of penetration from later rounds fired into it.
External Validity
After maximizing internal validity, the researcher must also plan for external validity so that the results can be generalized beyond the bullets used in the study. There are many conditions under which results may be generalized; no study can accomplish all of them. However, it's important to know what these conditions are since the generalizations that cannot be made set the limitations of the study.
External validity is the extent to which any difference in performance among the calibers can be generalized to (1) a larger population, such as other lots of ammunition of the same caliber made by the same manufacturer; (2) different populations, such as other ammunition of the same caliber made by different manufacturers; (3) "real-life" targets that the study targets purport to "simulate"; and (4) other conditions and settings.
How can a researcher determine if the results of a study can be generalized to a larger population of other same caliber bullets from the same manufacturer? If the bullets in a study are a random sample from this larger population of bullets, the bullets are representative of that population. This means that any sample of the same caliber bullets from this population can be expected to produce similar results.
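Drawing such a random sample might look like the following sketch. It assumes, purely for illustration, that the cartridges in a lot can be numbered from 1 to the lot size; the function name and the lot and sample sizes are hypothetical.

```python
import random


def draw_sample(lot_size, sample_size, seed=None):
    """Pick sample_size distinct cartridge numbers at random from a numbered lot."""
    rng = random.Random(seed)
    # random.sample draws without replacement, so no cartridge is picked twice.
    return sorted(rng.sample(range(1, lot_size + 1), sample_size))


# Hypothetical lot of 1,000 cartridges, 15 of which are pulled for testing.
print(draw_sample(lot_size=1000, sample_size=15, seed=3))
```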
How can the results be generalized to other conditions or settings? One way is to build important conditions into the research design. When the study at the beginning of this article was designed to compare the performance of different calibers in a variety of targets, we decided to see if performance results would generalize over the different target types. If a particular caliber shows superior performance, will this occur in all targets in the study? Some of the targets?
No one study can provide answers to all the questions that can be generated around a particular research question. Often, logic and expert judgment must be used to provide some tentative answers as to whether the results will generalize to the same calibers made by other manufacturers and to other conditions and settings. Will the same results be obtained in actual automobile doors as in simulated targets? Will the same results hold in extreme temperatures as in an indoor range? If it is important to answer these questions with confidence, the best procedure is to carry out a series of studies that vary the important conditions and settings to determine the extent of the generalization over conditions.
CRITERIA FOR AMMUNITION SELECTION
The criteria we are using to determine the most effective bullet are performance measures linked to adversary incapacitation. These performance measures are penetration, expansion and weight.
Reliable and Valid Measurements-Whenever any measurement is taken, whether it is a blood pressure test, an achievement test or measurement of bullet performance, it is important to know how reliable and valid these measurements are. Reliability refers to consistency of measurement; for example, it is the extent to which two raters measuring penetration for a given round obtain similar results. Validity refers to the accuracy of measurement; biased measurements can occur if the measurement of penetration for one of the calibers is consistently too high or too low.
Reliability and validity can affect the results of a study. If measurement is unreliable, for example, if the measurement was taken with a ruler made of very flexible rubber, it will be more difficult to find true differences among the calibers. If a measurement is biased for one caliber but not another, the results may show differences that are not true differences.
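One simple way to look at agreement between two raters is sketched below. The Pearson correlation is used here as one common index of consistency and the mean paired difference as a check for a systematic offset between raters; both choices, the function names and the measurements are illustrative assumptions rather than the procedure used in any particular study.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of measurements."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_difference(x, y):
    """Average of the paired differences (x minus y); far from zero suggests an offset."""
    return mean(a - b for a, b in zip(x, y))


# Hypothetical penetration readings (inches) for the same five rounds.
rater_1 = [12.0, 13.5, 11.8, 14.2, 12.9]
rater_2 = [12.1, 13.4, 12.0, 14.0, 13.0]

print("correlation between raters:", round(pearson_r(rater_1, rater_2), 3))
print("mean difference (rater 1 - rater 2):",
      round(mean_difference(rater_1, rater_2), 3))
```

A high correlation with a mean difference near zero suggests the two raters are measuring consistently; a high correlation with a large mean difference suggests one rater reads consistently higher than the other.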
A New Measurement Procedure-Of the three criteria for ammunition selection, the measurement of a round's penetration into a gelatin block seems to have the most potential for reliability and validity problems. The traditional method of measuring wound tracks in ballistic gelatin is to view the track through the surface of the gelatin block and measure the channel from bullet entry to the end of the "bounce back" with a tape measure or ruler. We call this method of measuring penetration "topical measurement."
There are two potential problems with the traditional measurement of penetration. The first problem centers on reliability of the measurement. Would optical/light refraction through the gelatin block result in inconsistent (more unreliable) results when penetration was measured topically? The second problem centers on the accuracy of the measurement. Is there sufficient curvature in some of the wound tracks that differential results would occur if a more accurate (valid) measure of the wound track were applied?
In our work in ammunition selection, these problems have been addressed by measuring each "wound track" by two different raters using two different methods. First, measurements were taken topically using a locking metal tape measure. Then, a medical urethral catheter was used to measure the wound track internally up to the back of the resting bullet. The total catheter measurement was the internal measurement added to a topical measurement from the back of the bullet up to and including "bounce-back." For each round fired, two raters measured penetration both topically and with the catheter.
Both topical and catheter procedures were highly reliable when the measurements of the two raters were compared. In examining the validity of the two procedures, we found that the heaviest caliber studied showed more curvature than the lightest caliber. The average curvature for the heaviest caliber was almost one-third of an inch, with the largest recorded curvature of over one-half inch. Therefore, if curvature is expected, it is probably best to use the catheter method of measuring penetration.
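The arithmetic behind the two measurements can be laid out as a short sketch. It assumes, for illustration only, that "curvature" is taken as the amount by which the total catheter measurement exceeds the straight-line topical measurement; the function names and the readings are hypothetical.

```python
def catheter_total(internal_in, bullet_back_to_bounce_back_in):
    """Total catheter measurement: internal track length up to the back of the
    bullet, plus the topical distance from there through the bounce-back."""
    return internal_in + bullet_back_to_bounce_back_in


def curvature(catheter_total_in, topical_in):
    """Assumed definition: extra track length found by following the channel."""
    return catheter_total_in - topical_in


# Hypothetical readings for a single round, in inches.
total = catheter_total(internal_in=11.0, bullet_back_to_bounce_back_in=1.5)
print("total catheter measurement:", total)
print("difference from topical measurement:",
      round(curvature(total, topical_in=12.2), 2))
```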
RATER BIAS
Rater bias can occur in ammunition selection research when the researchers themselves (raters) are measuring penetration, expansion and weight. Under these conditions it is necessary to guard against conscious or unconscious biases of the researchers who may favor a specific caliber. However, favoring a specific caliber should not prevent individuals from being active in a research project. Rather, controls must be built into the research that prevent conscious or unconscious biases from affecting the results.
The usual procedure for eliminating rater bias is to keep the raters "blind," that is, prevent those who take the penetration, expansion and weight measurements from knowing which caliber is being fired. In ammunition selection studies, firearms experts are often employed as researchers to select the most effective bullet. These experts can, for the most part, immediately determine bullet caliber from bullet performance; it is impossible to keep them "blind." To get around this problem, staff members not familiar with firearms can be taught to take penetration, expansion and weight measurements. Using blind raters will add much credibility to a research project.
STATISTICAL ANALYSES
When statistical inference tests are used in making decisions about results, the question being asked is, "Did the differences among the calibers happen by chance or are they true differences?" A statistically significant result is interpreted to mean that differences as large as those observed would be very unlikely to occur by chance alone if the calibers truly performed the same.
Ammunition and firearms experts may find it useful to call upon experts in research methodology and statistics to make recommendations concerning the design of the study, sample size, procedures and statistical analyses. Oftentimes, it is possible to use a graduate student in research methods and/or applied statistics at a local university to assist in research projects.
Conditions That Influence Statistical Tests
Several conditions influence whether results of performance tests are statistically significant. Two of the most important influences are the size of the sample and the variability of the data. In general, the larger the sample size (the number of test bullets fired) and the smaller the variability (the amount of variation in penetration of several rounds of a specific caliber), the more likely it is that the results will be statistically significant if true differences exist among the calibers tested.
While a researcher usually does not have control over the variability of the data, it is possible to have some control over sample size. In ammunition selection studies, because of the labor involved in making gelatin blocks, a sample of five rounds per caliber for several targets is considered quite large. Statistically, however, this is a small sample size and, depending on the variability of the data, differences as large as one inch may not be statistically significant.
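The effect of sample size and variability can be illustrated with a short sketch that computes the standard error and a 95% confidence interval for a mean. The function names, the sample values and the standard deviation of 0.8 inch are hypothetical; the two-sided 95% critical values of Student's t are taken from standard tables for the few sample sizes shown.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_95 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045}


def ci_half_width(sd, n):
    """Half-width of the 95% confidence interval for a mean of n rounds."""
    return T_CRIT_95[n - 1] * sd / sqrt(n)


def describe(sample):
    """Descriptive statistics for one caliber in one target type."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    half = ci_half_width(s, n)
    return {"n": n, "mean": m, "sd": s, "se": s / sqrt(n),
            "ci95": (m - half, m + half),
            "min": min(sample), "max": max(sample)}


# Hypothetical penetration measurements (inches) for five rounds.
print(describe([12.1, 12.8, 11.9, 12.5, 12.2]))

# Same variability, increasing numbers of rounds.
for n in (5, 10, 20, 30):
    print(n, "rounds: mean known to within +/-",
          round(ci_half_width(0.8, n), 2), "inch")
```

With five rounds and a standard deviation of 0.8 inch, the interval around each mean is roughly an inch wide on either side, which is why a one-inch difference between calibers can fail to reach significance in so small a sample.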
Statistical Procedures for Ammunition Selection Testing
Because various types of designs can be applied to ammunition selection studies, numerous types of statistical tests can be applied to the resultant data. The following analyses can be considered and discussed with a consulting statistician for additional advice with a specific project:
1. Descriptive statistics summarizing the number of rounds fired, the means, standard deviations, standard errors, 95% confidence intervals, and minimum and maximum measures can be recorded and displayed in tables;
2. Homogeneity of variance tests can be conducted to identify significant differences in the variability of the different calibers tested;
3. Analysis of variance (ANOVA) tests can be conducted to identify significant mean differences among two or more calibers for the various targets. If an equal number of rounds is fired for each caliber, ANOVA is the appropriate statistical test since it is robust to violations of the homogeneity of variance assumption; and
4. For those ANOVA analyses where significant differences are found, post hoc comparisons can be calculated to determine significant differences between all possible pairs of means for the different calibers tested in a project.
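The analyses listed above can be sketched for a single target type as follows. This is a minimal illustration under stated assumptions, not a substitute for a statistician's advice: the data are hypothetical, the groups are assumed to be of equal size, the variance ratio is only an informal screen rather than a formal homogeneity test, and Tukey's HSD is used as one common post hoc procedure. The critical values (3.89 for F with 2 and 12 degrees of freedom, 3.77 for the studentized range with 3 groups and 12 degrees of freedom, both at the .05 level) are tabled values that should be confirmed against published tables for the design actually used.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def variance_ratio(groups):
    """Largest sample variance divided by the smallest (informal screen only)."""
    variances = [variance(g) for g in groups.values()]
    return max(variances) / min(variances)


def one_way_anova(groups):
    """Return (F, df_between, df_within, mean_square_within)."""
    values = [v for g in groups.values() for v in g]
    grand_mean = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_within = ss_within / df_within
    return (ss_between / df_between) / ms_within, df_between, df_within, ms_within


def tukey_hsd_pairs(groups, ms_within, q_crit):
    """Flag each pair of means whose difference exceeds Tukey's HSD.
    Assumes every group has the same number of rounds."""
    n = len(next(iter(groups.values())))
    hsd = q_crit * sqrt(ms_within / n)
    return [(a, b, abs(mean(groups[a]) - mean(groups[b])) > hsd)
            for a, b in combinations(groups, 2)]


# Hypothetical penetration (inches), five rounds per caliber, one target type.
penetration = {
    "caliber A": [12.1, 12.8, 11.9, 12.5, 12.2],
    "caliber B": [13.4, 13.9, 13.1, 13.6, 13.5],
    "caliber C": [12.4, 12.9, 12.6, 12.2, 12.9],
}

print("variance ratio:", round(variance_ratio(penetration), 2))
f_stat, df_b, df_w, ms_w = one_way_anova(penetration)
print("F(%d, %d) = %.2f; significant at .05: %s"
      % (df_b, df_w, f_stat, f_stat > 3.89))
for a, b, differs in tukey_hsd_pairs(penetration, ms_w, q_crit=3.77):
    print(a, "vs", b, "-> significant" if differs else "-> not significant")
```

Statistical packages report exact probabilities for these tests, so the table lookups are needed only when working by hand.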
CONCLUSION
Ammunition selection research projects must be considered in the context of the overall difficulty in obtaining bullet performance data. Despite the best intentions of researchers to control potential bias and extraneous variables, "real world" variables associated with law enforcement combat situations can never be perfectly simulated.
The research and measurement techniques suggested for ammunition selection projects are not unique to ammunition selection; indeed, they are widely used in the physical and behavioral sciences. However, techniques of this type infrequently appear in law enforcement-related research literature for ammunition testing. When more rigorous approaches to research are used, there is much more confidence in the results and the interpretation of the results. The importance of valid results cannot be overstated; the lives of law enforcement officers depend on the results.
Footnote
(1) F.N. Kerlinger, Foundations of Behavioral Research (New York: Holt, Rinehart and Winston, 1984).